Distributed Activation, Search, and Learning by ART and ARTMAP Neural Networks

Author

  • Gail A. Carpenter

This research was supported in part by the National Science Foundation (NSF IRI 94-01659) and the Office of Naval Research (ONR N00014-95-1-0409 and ONR N00014-95-0657). In: Proceedings of the International Conference on Neural Networks (ICNN'96), Washington DC.

Abstract

Adaptive resonance theory (ART) models are being used for learning and prediction in a wide variety of applications. Winner-take-all coding allows these networks to maintain stable memories, but this type of code representation can cause problems such as category proliferation with fast learning and a noisy training set. A new class of ART models overcomes this limitation, permitting code representations to be arbitrarily distributed. With winner-take-all coding, the unsupervised distributed ART model (dART) reduces to fuzzy ART and the supervised distributed ARTMAP model (dARTMAP) reduces to fuzzy ARTMAP. dART automatically apportions learned changes according to the degree of activation of each coding node, for fast as well as slow learning with compressed or distributed codes. Distributed ART models replace the traditional neural network path weight with a dynamic weight equal to the rectified difference between coding node activation and an adaptive threshold. Dynamic weights that project to coding nodes obey a distributed instar learning law, and those that originate from coding nodes obey a distributed outstar learning law. Inputs activate distributed codes through phasic and tonic signal components with dual computational properties, and a parallel distributed match-reset-search process helps stabilize memory.

1. ART, ARTMAP, and Distributed Learning

ART [4,7] and ARTMAP [5,6] neural networks are being used for adaptive recognition and prediction in a variety of applications, including a Boeing parts design retrieval system, satellite remote sensing, medical database prediction, robot sensory-motor control and navigation, machine vision, 3D object recognition, electrocardiogram wave identification, automatic target recognition, air quality monitoring, signature verification, tool failure monitoring, chemical analysis from UV and IR spectra, electromagnetic system device design, and analysis of musical scores. The basic ART and ARTMAP networks feature winner-take-all (WTA) competitive coding, which groups inputs into discrete recognition categories. With fast learning but without WTA coding, certain input sequences may cause catastrophic forgetting of prior memories in these networks. Fast learning is useful for encoding important rare cases, but a combination of WTA coding and fast learning may lead to inefficient category proliferation with noisy training inputs.

This problem is partially solved by ART-EMAP [9,10], which uses WTA coding for learning and distributed category representation for test-set prediction. Distributed test-set category representation can significantly improve ARTMAP performance, especially when the size of the training set is small. In medical database prediction problems, which often feature inconsistent training input predictions, the ARTMAP-IC [8] network improves ARTMAP performance with distributed prediction, category instance counting, and a new match tracking search algorithm. Compared to the original match tracking algorithm, the new rule facilitates prediction with sparse or inconsistent data, improves memory compression without loss of accuracy, and is actually a better approximation of the original ARTMAP network differential equations. A voting strategy further improves prediction by training the system several times on different orderings of an input set. Voting, instance counting, and distributed test-set code representations combine to form confidence estimates for competing predictions.
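For reference, the winner-take-all fuzzy ART step that dART reduces to can be sketched as follows. This is a minimal sketch using the standard fuzzy ART choice function, vigilance test, and fast-learning update; the parameter names (alpha, rho, beta) and the NumPy layout are conventions chosen here for illustration, not notation from this paper.

    import numpy as np

    def fuzzy_art_present(I, W, rho=0.75, alpha=0.001, beta=1.0):
        # One winner-take-all fuzzy ART presentation of input I.
        # I   : input vector with components in [0, 1]
        # W   : (N, M) array of category weights (uncommitted nodes start at all ones)
        # rho : vigilance; alpha : choice parameter; beta : learning rate
        # Choice function T_j = |I ^ w_j| / (alpha + |w_j|), with ^ the component-wise minimum.
        T = np.array([np.minimum(I, w).sum() / (alpha + w.sum()) for w in W])
        for j in np.argsort(-T):                          # search categories in order of choice
            match = np.minimum(I, W[j]).sum() / I.sum()   # match fraction at the matching field
            if match >= rho:                              # vigilance test passed: resonance
                W[j] = beta * np.minimum(I, W[j]) + (1 - beta) * W[j]  # fast learning when beta = 1
                return j, W
        return None, W                                    # all nodes reset: a new category would be recruited

    # Example: three uncommitted categories, one complement-coded input a = (0.9, 0.1)
    W = np.ones((3, 4))
    j, W = fuzzy_art_present(np.array([0.9, 0.1, 0.1, 0.9]), W)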
However, these and most other ART and ARTMAP variants have used WTA coding during learning, so they do not solve the category proliferation problem of noisy training sets. A new class of ART models retains stable coding, recognition, and prediction, but allows arbitrarily distributed category representation during learning as well as performance [2]. When the category representation is winner-take-all, the unsupervised distributed ART model (dART) reduces to fuzzy ART [7] and the supervised distributed ARTMAP model (dARTMAP) reduces to fuzzy ARTMAP [5]. Distributed ART and ARTMAP networks automatically apportion learned changes according to the degree of activation of each category node. This permits fast as well as slow learning without catastrophic forgetting.

In distributed ART models, dynamic weights replace the multiplicative long-term memory weights found in most neural networks. The input signal that activates the distributed code is a function of a phasic component, which depends on the active input, and a tonic component, which depends on prior learning but is independent of the current input. The computational properties of the phasic and tonic components are derived from a formal analysis of distributed pattern learning. However, these components can be interpreted as postsynaptic membrane processes, with phasic terms mediated by ligand-gated receptors and tonic terms mediated by voltage-gated receptors [16]. At each synapse, phasic and tonic terms balance one another and exhibit dual computational properties. During learning with a constant input, phasic terms are constant while tonic terms may grow. Tonic components would then become larger for all inputs, but phasic components become more selective, reducing the total coding signal that would be sent by a significantly different input pattern. A geometric interpretation of distributed ART represents the tonic component as a coding box in input space and the phasic component as the coding box expanded to include the current input.

Although dART with WTA coding is computationally equivalent to fuzzy ART, the dART architecture differs from the standard ART architecture. An ART input from a field F0 passes through a matching field F1 before activating a coding field F2. Activity at F2 feeds back to F1, forming a resonant loop. ART networks thus encode matched F1 patterns rather than the F0 inputs themselves, a key feature for code stability. With winner-take-all coding, the matched F1 pattern confirms the original category choice when it feeds back up to F2. With F1-to-F2 feedback, this essential property may not persist when the F2 code is distributed. In the distributed ART network, the coding field F2 receives input directly from F0, retaining the bottom-up / top-down matching process at F1 only to determine whether an active code meets the vigilance matching criterion (Figure 1). Nevertheless, dART dynamic weights maintain code stability. When the matching process is disabled by setting the vigilance parameter to 0, dART becomes a type of feedforward ART network.
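As a concrete illustration of the dynamic weight described above, the short sketch below computes the rectified difference [y_j - tau_ij]^+ between coding node activation and an adaptive threshold, and then applies a learned change that is scaled by each node's activation so that inactive nodes are untouched. The threshold update shown is an illustrative stand-in for the apportioning idea, not the paper's distributed instar law, and the array shapes are assumptions.

    import numpy as np

    M, N = 4, 3                                # input dimension, number of F2 coding nodes
    tau = np.zeros((M, N))                     # adaptive thresholds, one per path i -> j
    y = np.array([0.7, 0.3, 0.0])              # a distributed F2 code (node 3 is inactive)

    # Dynamic weight on each path: the rectified difference [y_j - tau_ij]^+
    w = np.maximum(y - tau, 0.0)               # shape (M, N); zero wherever tau_ij >= y_j

    # Illustrative apportioning of a learned change: each threshold moves toward its
    # node's activation at a rate scaled by y_j, so the inactive node is unchanged and
    # more active nodes absorb a larger share of the change.  This is a stand-in for
    # the idea in the text, not the paper's distributed instar learning law.
    rate = 0.5
    tau = tau + rate * y * np.maximum(y - tau, 0.0)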
2. Distributed Activation

A dART network includes a field of nodes F0 that represents a current input vector; a field F2 that represents the active code; and a field F1 that represents a matched pattern determined by bottom-up input from F0 and top-down input from F2. Vector I = (I_1, ..., I_i, ..., I_M) denotes F0 activity, x = (x_1, ..., x_i, ..., x_M) denotes F1 activity, and y = (y_1, ..., y_j, ..., y_N) denotes F2 activity. Each component of I, x, and y is contained in the interval [0, 1].

[Figure: coding function]
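A compact sketch of the three fields just described, in Python. The top-down expectation matrix and the fuzzy-intersection rule used to form the F1 matched pattern follow fuzzy ART conventions and are assumptions made for illustration, not the paper's F1 equations.

    import numpy as np

    M, N = 4, 3
    I = np.array([0.9, 0.1, 0.8, 0.2])         # F0 activity: current input, components in [0, 1]
    y = np.array([0.0, 1.0, 0.0])              # F2 activity: the active code (winner-take-all here)

    # Hypothetical top-down expectations, one M-vector in [0, 1] per coding node
    expectations = np.array([[0.2, 0.9, 0.3, 0.8],
                             [1.0, 0.2, 0.9, 0.3],
                             [0.5, 0.5, 0.5, 0.5]])

    sigma = y @ expectations                   # top-down signal to F1, blended by the code y
    x = np.minimum(I, sigma)                   # F1 activity: matched pattern, components stay in [0, 1]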

Similar articles

Distributed ARTMAP: a neural network for fast distributed supervised learning

Distributed coding at the hidden layer of a multi-layer perceptron (MLP) endows the network with memory compression and noise tolerance capabilities. However, an MLP typically requires slow off-line learning to avoid catastrophic forgetting in an open input environment. An adaptive resonance theory (ART) model is designed to guarantee stable memories even with fast on-line learning. However, AR...

dARTMAP: A Neural Network for Fast Distributed Supervised Learning

Distributed coding at the hidden layer of a multi-layer perceptron (MLP) endows the network with memory compression and noise tolerance capabilities. However, an MLP typically requires slow off-line learning to avoid catastrophic forgetting in an open input environment. An adaptive resonance theory (ART) model is designed to guarantee stable memories even with fast on-line learning. However, ART...

ART Neural Networks for Medical Data Analysis and Fast Distributed Learning

ART (Adaptive Resonance Theory) neural networks for fast, stable learning and prediction have been applied in a variety of areas. Applications include airplane design and manufacturing, automatic target recognition, financial forecasting, machine tool monitoring, digital circuit design, chemical analysis, and robot vision. Supervised ART architectures, called ARTMAP systems, feature internal co...

Distributed ARTMAP - International Joint Conference on Neural Networks (IJCNN '99), 1999

Distributed coding at the hidden layer of a multi-layer perceptron (MLP) endows the network with memory compression and noise tolerance capabilities. However, an MLP typically requires slow off-line learning to avoid catastrophic forgetting in an open input environment. An adaptive resonance theory (ART) model is designed to guarantee stable memories even with fast on-line learning. However, ART ...

ART Neural Networks: Distributed Coding and ARTMAP Applications

ART (Adaptive Resonance Theory) neural networks for fast, stable learning and prediction have been applied in a variety of areas. Applications include airplane design and manufacturing, automatic target recognition, financial forecasting, machine tool monitoring, digital circuit design, chemical analysis, and robot vision. Supervised ART architectures, called ARTMAP systems, feature internal co...


Journal title:

Volume   Issue

Pages   -

Publication date: 1996